vignette-060-spread-figure.RmdMap of SLF Spread Through Time in the U.S.
We read in the packages to access data and generate the spread map. We also set the option to not download counties shapefiles every time (only the first).
library(slfrsk) #this package, has extract_enm()
library(tidyverse) #data manipulation
library(here) #making directory pathways easier on different instances
library(sf) #simple features for easy shapefile manipulation
library(magrittr) #fast pipe assign
library(RColorBrewer) #palette
#packages to use to extract raw data
#library(lubridate) #extract years
#library(tigris) #county level shapefiles
#library(lycormap) #package to get tinyslf dataset from
#options(tigris_use_cache = TRUE) # make sure that tigris does not keep downloading the filesThe data used here for the spread figure are based on an upcoming
suite of packages (lycodata and lycormap)
produced by the iEcoLab. The records in these packages contain data that
are not yet available to the public, so all questions regarding the raw
data should be directed to the creator of the lycor* suite,
Seba De Bona (SDB, (sebastiano.debona@temple.edu))
or to Matthew R. Helmus (mrhelmus@temple.edu).
Here, we will read in a .rda file that was created using
the example code below from the raw data. These example code chunks are
adapted from SDB code.
#data call
data(by_county, package = "slfrsk")We obtain the counties shapefile as a simple feature (sf) from the
tigris package.
counties <- counties(cb = T, resolution = "5m")Now, we read in the SLF records from the package
lycormap. We also obtain the supplemental SLF records that
contain the updated transportation data for uninvaded states, which was
obtained from the raw data in the lycodata package. At the
time of publication of slfrsk, additional data were saved
as raw data that comes with lycordata
(slf_county_record_timeline.csv).
data("tinyslf", package = "lycormap")
# reading manually added data (for a later step)
man_county_data <- read_csv(file.path(here(), "..", "..", "lycordata", "data-raw", "additional", "slf_county_record_timeline.csv"),
col_types = cols(.default = "c", FIPS = "d",
RT_check = "l", ST_check = "l", RS_match = "l"))This code finds the county that each presence point corresponds to and attaches the county shapefile data to those rows. It also removes records that do not fall in a U.S. county (likely erroneous records). We also remove an errant record for Wasatch county, UT.
#adapted from SDB code in lycordata
#add column of temporary row ID's for later reference
tinyslf %<>%
add_column(row_ID = 1:nrow(.))
tinyslf %<>%
# reducing data to coordinates only, and ID for future merging
dplyr::select(latitude, longitude, row_ID) %>%
# transforming into sf object
st_as_sf(coords = c("longitude", "latitude"),
crs = st_crs(counties)) %>%
#intersecting state polygons with data coordinates to match them to U.S. counties
st_join(., counties, join = st_intersects) %>%
#make it a tibble now
as_tibble() %>%
#simplify to row ID, county name and identity
dplyr::select(row_ID, county = NAME, GEOID) %>%
#then joining this into main tinyslf data
left_join(tinyslf, ., by = "row_ID") %>%
#remove the temporary id column
dplyr::select(-c(row_ID))
#remove the NA values for county. which are the points that do not fall in a county
tinyslf %<>%
filter(!is.na(county))
#rm errant Wasatch co, UT record
tinyslf %<>%
filter(!(state %in% "UT" & county %in% "Wasatch"))Add the observations and establishment dates to the county outline
data. Now, the earliest year of establishment or recording of an
observation (dead or alive) are identified from the tinyslf
dataset and then added accordingly to county shapefile data. Lastly, the
projection is confirmed.
#get the year of establishment
by_county <- tinyslf %>%
group_by(GEOID) %>%
arrange(year) %>%
filter(slf_established) %>%
summarize(YearOfEstablishment = min(year)) %>%
ungroup() %>%
#join to the counties shapefile
left_join(counties, ., by = "GEOID")
# and add the first record year
by_county <- tinyslf %>%
group_by(GEOID) %>%
arrange(year) %>%
filter(slf_present) %>%
summarize(FirstRecord = min(year)) %>%
ungroup() %>%
#join to shapefile that also includes year of establishment
left_join(by_county, ., by = "GEOID")
# providing correct projections
by_county %<>%
st_transform('+proj=longlat +datum=WGS84')Because our first record and establishment data are constantly
updating from multiple sources, we have a googlesheet that is the source
of man_by_county (saved as
slf_county_record_timeline.csv). We again adapt code from
SDB to tidy and splice these new records into the counties dataset.
#we can use the FIPS code to merge to the main data
#extract the status and the year of record/establishment
#there are three date columns that hopefully can be collapsed into 1
#standardize date format
man_county_data %<>%
mutate_at(vars(starts_with("Date")), .funs = ~parsedate::parse_date(.))
#data can have multiple dates associated to it, meaning a county can have info on when the first record was scored separately from establishment date
#not all infestations have an establishment data set, so cases where it is absent will be set to 2020 based on when data were analyzed
# for Date_Alive/Date_Morbound, we'll take the smallest, if both are present
man_county_data %<>%
mutate(Year_Alive = year(Date_Alive),
Year_Morbund = year(Date_Morbund),
Year_FirstRecord = ifelse(!(is.na(Year_Alive) & is.na(Year_Morbund)),
pmin(Year_Alive, Year_Morbund, na.rm = T),
NA)
) %>%
dplyr::select(-c(Year_Alive, Year_Morbund))
#create two variables that echo those in the by_county for years of first record and establishment
man_county_data %<>%
filter(!is.na(Status)) %>%
#first grabbing the date (if present) from the Date_Establish or Date_Alive/Morbound columns or setting 2020 to established records without dates
#establishment
mutate(YearOfEstablishment_man = ifelse(
#if infestation and has a date of establishment
Status == "Infestation" & !is.na(Date_Establish),
year(Date_Establish),
#if infestation and no date, sets to 2020
ifelse(Status == "Infestation" & is.na(Date_Establish), 2020, NA)),
#first record
FirstRecord_man = ifelse(
#not infested and has a year of first record
Status != "Infestation" & !is.na(Year_FirstRecord),
Year_FirstRecord,
#infested and does not have a year of first record is set to 2020
ifelse(Status == "Infestation" & is.na(Year_FirstRecord), 2020, NA))) %>%
dplyr::select(-Year_FirstRecord)Now that both datasets are cleaned to have years of first record and establishment, they can be combined. A combined FIPS code is added to the full county dataset first.
#create combined FIPS columns
by_county %<>%
mutate(FIPS = as.numeric(paste0(STATEFP, COUNTYFP)))
#merge
by_county <- man_county_data %>%
dplyr::select(FIPS, YearOfEstablishment_man, FirstRecord_man) %>%
left_join(by_county, ., by = "FIPS")There may be some duplicate rows, and so the first record and establishment rows are retained. This dataset contains incomplete data for 2021, so we recode these as well, since our data are for as of 2020.
#pick earliest year (between the records mined from the data and the manually compiled records)
by_county %<>%
mutate(YearOfEstablishment = pmin(YearOfEstablishment, YearOfEstablishment_man, na.rm = T),
FirstRecord = pmin(FirstRecord, FirstRecord_man, na.rm = T)) %>%
#recode the 2021 data
#get cases where the first record is 2021, and set record to NA
mutate(FirstRecord = case_when(
FirstRecord == 2021 ~ NA_real_,
TRUE ~ FirstRecord
),
#get cases where the first record or year of establishment is 2021 and set establish to NA
YearOfEstablishment = case_when(
FirstRecord == 2021 ~ NA_real_,
YearOfEstablishment == 2021 ~ NA_real_,
TRUE ~ YearOfEstablishment
)
)The data are read in above, and the year columns are also transformed into factors for easier plotting.
# transforming records and establishment
by_county %<>%
mutate(YearOfEstablishment = as.factor(YearOfEstablishment),
FirstRecord = as.factor(FirstRecord))Lastly, we plot the data after identifying a set of focal counties to
outline (this bypasses layering issues with sf objects and
plotting with ggplot2).
#get the focal counties to outline
focal_counties <- by_county %>%
filter(!is.na(FirstRecord)) %>%
as.data.frame(.) %>%
dplyr::select(GEOID) %>%
unlist()
map_plot <- ggplot() +
#geom_sf(data = by_county %>% filter(!GEOID %in% focal_counties), fill = "#FFFFFF", size = 0.5) +
geom_polygon(data = map_data('state'), aes(x = long, y = lat, group = group), fill = "#FFFFFF", color = "black", lwd = 0.10) +
geom_polygon(data = map_data('state', c("Connecticut","Delaware", "Maryland", "New Jersey", "New York", "Ohio", "Pennsylvania", "Virginia", "West Virginia")), aes(x = long, y = lat, group = group), fill = NA, color = "#e31a1c", lwd = 0.75) +
geom_sf(data = by_county %>% filter(GEOID %in% focal_counties), aes(color = FirstRecord, fill = YearOfEstablishment), show.legend = T) +
coord_sf(xlim = c(-124.7628, -66.94889), ylim = c(24.52042, 49.3833), expand = FALSE) +
labs(x = "", y = "") +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_rect(fill = "aliceblue")) +
scale_color_brewer(palette = "YlOrRd", direction = 1, name = "Regulatory Incident", guide = guide_legend(override.aes = list(fill = "#808080"))) +
scale_fill_brewer(palette = "YlOrRd", direction = 1, name = "Establishment", na.value = "#808080", breaks = c(2014,2015,2016,2017,2018,2019,2020)) +
ggtitle(label = "")
map_plot
We also include a modification of SDB code here to show how to make a
nice leaflet interactive map too… but we do not use this
code.
library(leaflet)
#> Warning: package 'leaflet' was built under R version 4.0.5
colpal <- colorFactor(c(brewer.pal(7, "YlOrRd")),
by_county$YearOfEstablishment,
na.color = "transparent")
fillpal <- colorFactor(c(brewer.pal(7, "YlOrRd")),
by_county$FirstRecord,
na.color = "transparent")
leaflet(data = by_county,
options = leafletOptions(minZoom = 4, maxZoom = 18)) %>%
addProviderTiles(providers$CartoDB.Positron) %>%
addPolygons(label = ~NAME,
color = ~colpal(FirstRecord),
weight = 2,
smoothFactor = 0.5,
opacity = 0.5,
fillOpacity = 0.5,
fillColor = ~fillpal(YearOfEstablishment)) %>%
setView(lat = 40.80, lng = -100, zoom = 4) %>%
addLegend("bottomleft", pal = fillpal,
title = "Invaded since",
values = ~YearOfEstablishment,
opacity = 1
)